Abstract

Predictive statistical mechanics is a form of inference from available data, without additional assumptions,
for predicting reproducible phenomena. Applying it to systems with Hamiltonian dynamics, we consider the
problem of predicting the macroscopic time evolution of the system in the case of incomplete information about the
microscopic dynamics. In a model of a closed Hamiltonian system (i.e. a system that can
exchange energy but not particles with the environment) that combines the Liouville equation with the concepts
of information theory, we analyze the loss of correlation between the initial phase space paths
and final microstates, and the related loss of information about the state of the system. It is demonstrated
that applying the principle of maximum information entropy by maximizing the conditional information 
entropy, subject to the constraint given by the Liouville equation averaged over the phase space, leads to a 
definition of the rate of change of entropy without any additional assumptions. In the subsequent paper [1] 
this basic model is generalized further and brought into direct connection with the results of nonequilibrium 
theory. 
I. INTRODUCTION


Foundations of predictive statistical mechanics were formulated by E. T. Jaynes in his well-known
papers [2, 3]. There he gave a full development of the results of equilibrium statistical mechanics
and the formalism of Gibbs [4] as a form of statistical inference based on Shannon’s concept
of a measure of information [5]. Shannon’s measure of information is also known as information
entropy, and in the interpretation given by Jaynes it is the correct measure of the “amount of
uncertainty” represented by a probability distribution [6]. Maximization of information entropy
subject to given constraints is a central concept in Jaynes’ approach, known as the principle of
maximum information entropy. Application of this principle allows the construction of a probability
distribution that includes only the information represented by the given constraints,
without any additional assumptions. Jaynes’ approach is based on Gibbs’ formalism of statistical
mechanics, which Jaynes considered to represent a general method of statistical inference
for problems in which the available information is not complete [7]. This includes equilibrium
statistical mechanics [2] and the formulation of a theory of irreversibility [3], which Jaynes tried to
accomplish in his later works [6-10].

Predictions and calculations for different irreversible processes usually involve three distinct 
stages [7]: 

(1) Setting up an “ensemble”, i.e., choosing an initial density matrix, or in our case an N-particle
distribution, which is to describe our initial knowledge about the system of interest;

(2) Solving the dynamical problem; i.e., applying the microscopic equations of motion to obtain 
the time evolution of the system; 

(3) Extracting the final physical predictions from the time developed ensemble. 

As fully recognized by Jaynes, the availability of the general solution of stage (1) simplifies the
complicated stage (2). The problem also includes an equally important stage (0), consisting of
some kind of measurement or observation defining both the system and the problem [11]. In direct
mathematical attempts to construct a theory of irreversibility, the Liouville theorem, with the
conservation of phase space volume inherent to Hamiltonian dynamics, is often represented as one
of the main difficulties. The relation between the Liouville equation and irreversible macroscopic
behavior is one of the central problems in statistical mechanics. For that reason this extremely
complicated equation is usually reduced to an irreversible equation known as the Boltzmann equation,
rate equation or master equation. On the other hand, Jaynes considers the Liouville equation and the related
constancy in time of Gibbs’ entropy as precisely the dynamical property needed for the solution of this
problem, considering it to be more of a conceptual than a mathematical nature [6, 8]. In a simple
demonstration based on the Liouville theorem, this makes it possible for Jaynes to generalize the
second law beyond the restrictions of initial and final equilibrium states, by considering it a special
case of a general restriction on the direction of any reproducible process [8, 12]. The real reason
behind the second law, since phase space volume is conserved in the dynamical evolution, is a
fundamental requirement on any reproducible process: the phase space volume W' compatible
with the final (macroscopic) state cannot be less than the phase space volume W_0 which describes
our ability to reproduce the initial state [8].

Mathematical clarity of Jaynes’ viewpoint has its basis in a limit theorem noted by Shannon
[5], known as the asymptotic equipartition theorem of information theory. Application of this theorem
relates, in certain cases and in the limit of a large number of particles, Boltzmann’s formula for
the entropy of a macrostate and the Gibbs expression for entropy [7, 8, 10]. The mathematical connection
with Boltzmann’s interpretation of entropy as the logarithm of the number of ways (or microstates)
by which a macroscopic state can be realized then gives a simple physical interpretation
to Gibbs’ formalism and its generalization in the maximum-entropy formalism. Maximization
of the information entropy subject to given constraints then predicts the macroscopic behavior 
that can happen in the greatest number of ways compatible with the information represented by 
given constraints. In application to time dependent processes, this is referred to by Jaynes as 
the maximum caliber principle [9, 10]. Jaynes clearly stated that predictive statistical mechanics 
does not represent a physical theory that explains the behavior of different systems by deductive 
reasoning from the first principles, but a form of statistical inference that makes predictions of 
observable phenomena from incomplete information [9]. For this reason predictive statistical mechanics
cannot claim certainty for its predictions in the way that a deductive theory can. This
does not mean that predictive statistical mechanics ignores the laws of microphysics; it certainly 
uses everything known about the structure of microstates and any data on macroscopic quantities, 
without making any extra physical assumptions beyond what is given by available information. It 
is important to note that sharp, definite predictions of macroscopic behavior are possible only when 
certain behavior is characteristic of each of the overwhelming majority of microstates compatible 
with data. For the same reason, this is just the behavior that is reproduced experimentally under 
those constraints; this is known essentially as the principle of macroscopic uniformity [2, 3], or 
the principle of macroscopic reproducibility [10]. In somewhat different context this property is 
recognized as the concept of macroscopic determinism, whose precise definition involves some sort
of thermodynamic limit [13, 14]. The second law of thermodynamics predicts only that a change
of macroscopic state will go in the general direction of greater final entropy, but not at which rate,
or along which path [9, 10, 12]. It is clear that better predictions are possible only by introducing 
more information. Macrostates of higher entropy can be realized in overwhelmingly more ways, 
and this is the basic reason for high reliability of the Gibbs equilibrium predictions [10]. In this 
context, Jaynes also speculated that accidental success in the reversal of an irreversible process is 
exponentially improbable [12]. 

Jaynes’ interpretation of irreversibility and the second law reflects the point of view of the 
actual experimenter. Zurek [15] has proposed the definition of physical entropy as the sum of the
missing information about the microscopic state, given by Shannon’s information entropy, and the 
algorithmic information content present in the available data about the system. In the limit of 
Zurek’s approach in which measurement is complete and the microstate is known, physical entropy 
of the system is given by the algorithmic information content about the microscopic state in which 
the system is found [15]. Zurek’s interpretation of the physical entropy and thermodynamics is 
given at the level of observers that can acquire information through measurements and process 
it in accordance with the basic laws of computation in a manner analogous to Turing machines. 
Jaynes has maintained the position that measurements [3] in practice always represent far less
than the maximum observation which would enable us to determine a definite pure state (i.e. the
microscopic state of the system). This is the reason why [3] we must have recourse to maximum-entropy
inference in order to represent our degree of knowledge about the system in a way free of
arbitrary assumptions with regard to missing information.

The MaxEnt algorithm is a general method of constructing the probability distribution by applying
the principle of maximum information entropy in cases when the distribution is not determined
uniquely by the available information. Arbitrary assumptions can be avoided by selecting the probability
distribution which is compatible with the available information and which is characterized
by the largest uncertainty related to missing information. Inferences drawn from such a probability
distribution depend only on the real degree of knowledge [2, 3]. In predictive statistical mechanics,
the probability distribution that maximizes the information entropy (uncertainty) subject to
constraints given by available macroscopic data represents the real uncertainty related to missing
information about the actual microscopic state of the system.
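As a minimal illustration of the MaxEnt algorithm described above, the following Python sketch constructs the maximum-entropy distribution on a finite set of outcomes when the only available information is a prescribed mean value. The example (a six-sided die with a given average) and all names in the code (`maxent_distribution`, `lam`, the bisection bounds) are illustrative assumptions and do not appear in the model developed in this paper. The sketch relies on the standard result that the maximizer has the exponential form p_i ∝ exp(-lam x_i), with the Lagrange multiplier lam fixed by the constraint.

```python
import math

def _exponential_family(values, lam):
    """Normalized weights p_i proportional to exp(-lam * x_i)."""
    exponents = [-lam * x for x in values]
    shift = max(exponents)                      # avoids overflow in exp
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def mean(values, probs):
    return sum(p * x for p, x in zip(probs, values))

def entropy(probs):
    """Information entropy -sum p log p (natural logarithm)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum-entropy distribution on `values` with the given mean.

    The mean of the exponential family decreases monotonically with lam,
    so the multiplier is located by bisection (bounds are an assumption).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(values, _exponential_family(values, mid)) > target_mean:
            lo = mid        # mean too large: increase lam
        else:
            hi = mid
    return _exponential_family(values, 0.5 * (lo + hi))

faces = [1, 2, 3, 4, 5, 6]
p_unbiased = maxent_distribution(faces, 3.5)    # constraint carries no bias
p_biased = maxent_distribution(faces, 4.5)      # data favour high faces
```

With the mean fixed at 3.5 the constraint adds no information beyond normalization and the result is the uniform distribution; with the mean fixed at 4.5 the probabilities increase with the face value, and any other distribution with the same mean has a lower entropy.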

In a similar line of reasoning Grandy [16, 17, 19] has developed a detailed model of time dependent
probabilities and density matrix for macroscopic systems with time dependent constraints
within the MaxEnt formalism, and applied it to typical processes in nonequilibrium thermodynamics
and hydrodynamics [18, 19]. In the context of the interplay between macroscopic constraints on
the system and its microscopic dynamics, it is interesting to note that the MaxEnt formalism has
also been studied as a method of approximately solving partial differential equations governing the time
evolution of probability distributions. For more complete further reference, we only mention here
that this method, among other examples, has been applied to the Liouville-von Neumann equation
[20], the family of dynamical systems with divergenceless phase space flows including Hamiltonian
systems [21], the generalized Liouville equation and continuity equations [22]. Universality of
this approach has been established for the general class of evolution equations that conform to
the essential requirements of linearity and preservation of normalization [23]. This method has
also been considered for classical evolution equations with source terms within a framework where
normalization is not preserved [24].

In this and in the subsequent paper [1] we consider the application of predictive statistical
mechanics to the problem of predicting the macroscopic time evolution of systems with Hamiltonian
dynamics, in the case when the information about the microscopic dynamics of the system is not
complete. For this purpose we have developed a basic theoretical model for a closed system with
Hamiltonian dynamics. Concepts of Hamiltonian mechanics and probability distributions in the
phase space applied in this model are defined in Sections II and III. In Section IV the information
entropies that correspond to those probability distributions are defined. The model is set up and its
results are analyzed in Section V. Results that have already been presented in [25] were obtained
in a model of a closed system with a time independent Hamiltonian function. In this paper, we
have also included in this basic model closed systems with a Hamiltonian function that depends on
time. Conclusions based on these results are presented in Section VI. They are the basis for further
generalization of this basic theoretical model in the subsequent paper [1], where it is brought into
direct connection with the results of the nonequilibrium theory.

II. HAMILTONIAN DYNAMICS AND PHASE SPACE PATHS 

The dynamical state of a Hamiltonian system with s degrees of freedom is described by the
generalized coordinates q_1, q_2, ..., q_s and their conjugate momenta p_1, p_2, ..., p_s. At any time t it
is represented by a point in the 2s-dimensional space Γ called the phase space of the system. The
notation (q,p) = (q_1, q_2, ..., q_s, p_1, p_2, ..., p_s) is introduced for the set of generalized coordinates
and conjugate momenta forming together the 2s coordinates of the phase space Γ. The time dependence
of the 2s dynamical variables (q,p) is determined by Hamilton’s equations

\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1, 2, \ldots, s, \qquad (1)

where H = H(q,p,t) is the Hamiltonian function of the system. For given values (q_0,p_0) at
some time t_0, the solution of Hamilton’s equations (1) uniquely determines the values of the dynamical
variables (q,p) at any other time t:

q_i = q_i(t; q_0, p_0), \qquad p_i = p_i(t; q_0, p_0), \qquad i = 1, 2, \ldots, s. \qquad (2)




Hence, a point (q,p) in the phase space Γ representing the state of the system describes over time
a curve called a phase space path, uniquely determined by the solution of (1). The set Ω(Γ) is the
set of all phase space paths in Γ. At time t only one path ω ∈ Ω(Γ) passes through the point
(q,p)_ω ∈ Γ, and this is denoted by the index in (q,p)_ω. The velocity v of the point
(q,p) in the phase space Γ corresponding to the values of the dynamical variables at time t is given by

\mathbf{v} = (\dot{q}, \dot{p}) = \left( \frac{\partial H}{\partial p_i}, \; -\frac{\partial H}{\partial q_i} \right), \qquad i = 1, 2, \ldots, s. \qquad (3)

The velocity vector v((q,p)_ω, t) is tangential at the point (q,p)_ω ∈ Γ to the phase space path ω
passing through it at time t. This defines the velocity vector field v((q,p), t) on Γ.
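The unique determination of a phase space path by its initial point can be illustrated numerically. The following Python sketch integrates Hamilton’s equations (1) for a single degree of freedom and follows the path forward and then backward in time. The pendulum Hamiltonian H = p^2/2 - cos q, the leapfrog integration scheme and the function names are assumptions chosen only for this illustration; they are not part of the model.

```python
import math

def hamiltonian(q, p):
    """Illustrative time independent Hamiltonian of a pendulum."""
    return 0.5 * p * p - math.cos(q)

def velocity(q, p):
    """Phase space velocity (dq/dt, dp/dt) = (dH/dp, -dH/dq)."""
    return p, -math.sin(q)

def flow(q0, p0, t, steps=10000):
    """Point reached after time t on the path through (q0, p0).

    Leapfrog (kick-drift-kick) integration; t may be negative, in which
    case the path is followed backward in time.
    """
    dt = t / steps
    q, p = q0, p0
    for _ in range(steps):
        p += 0.5 * dt * velocity(q, p)[1]   # half kick
        q += dt * p                         # drift, dq/dt = p
        p += 0.5 * dt * velocity(q, p)[1]   # half kick
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = flow(q0, p0, 5.0)       # subsequent point on the path
q2, p2 = flow(q1, p1, -5.0)      # precedent point, back at the start
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

Following the path forward for a time t and then backward for the same time returns the initial point to within rounding error, and the value of H along the path stays constant up to the discretization error of the scheme, as expected for a time independent Hamiltonian function.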

III. MICROSTATE PROBABILITY AND PATH PROBABILITY 

It is now possible to relate the microstate probability and the path probability in the phase
space Γ of the system. Let the function f(q,p,t) be a microstate probability density function on Γ.
All points in the phase space Γ move according to Hamilton’s equations (1), and f(q,p,t) satisfies
the Liouville equation

\frac{\partial f}{\partial t} + \sum_{i=1}^{s} \left( \frac{\partial f}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial H}{\partial q_i} \right) \equiv \frac{df}{dt} = 0. \qquad (4)

Since df/dt is a total or hydrodynamic derivative, (4) expresses that the time rate of change of
f(q,p,t) is zero along any phase space path given by the solution of Hamilton’s equations. In the
notation used here, this fact is written as

f((q,p)_\omega, t) = f((q_0,p_0)_\omega, t_0), \qquad (5)

where points on the path ω ∈ Ω(Γ) are related by (2).





In order to relate the microstate probability and the path probability in the phase space Γ, a
probability density function ℱ(q,p,t; q_0,p_0,t_0) is introduced on the 4s-dimensional space Γ × Γ.
This function has the following special properties. If the integral of ℱ(q,p,t; q_0,p_0,t_0) is taken
over the phase space Γ with (q,p) as the integration variables, it gives the microstate probability
density function f(q_0,p_0,t_0) at time t_0,

f(q_0,p_0,t_0) = \int_\Gamma \mathcal{F}(q,p,t; q_0,p_0,t_0)\, d\Gamma. \qquad (6)

The microstate probability density function f(q,p,t) at time t is obtained analogously,

f(q,p,t) = \int_\Gamma \mathcal{F}(q,p,t; q_0,p_0,t_0)\, d\Gamma_0. \qquad (7)

It is straightforward to prove, using relation (5), that (6) and (7) are satisfied if the function
ℱ(q,p,t; q_0,p_0,t_0) has the following form:

\mathcal{F}(q,p,t; q_0,p_0,t_0) = f(q,p,t) \prod_{i=1}^{s} \delta(q_i - q_i(t; q_0,p_0))\, \delta(p_i - p_i(t; q_0,p_0)), \qquad (8)

where q_i(t; q_0,p_0) and p_i(t; q_0,p_0) are given by (2) and the δ’s are Dirac delta functions. In the space
Γ × Γ the function ℱ(q,p,t; q_0,p_0,t_0) given by (8) represents the probability density that the point
corresponding to the state of the system is in the element dΓ_0 around the point (q_0,p_0) at time t_0
and in the element dΓ around the point (q,p) at time t.
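The proof that the form (8) satisfies (6) and (7) can be sketched in two steps. The sketch below writes the joint density of (8) as \mathcal{F}, abbreviates the product of delta functions in (8) as \Delta (a symbol introduced only here), and assumes the invariance of the phase space measure under Hamiltonian motion (Liouville’s theorem), so that the change of variables from (q_0,p_0) to the evolved point (q(t;q_0,p_0), p(t;q_0,p_0)) has unit Jacobian.

```latex
\begin{align*}
\Delta &\equiv \prod_{i=1}^{s} \delta\bigl(q_i - q_i(t;q_0,p_0)\bigr)\,
                               \delta\bigl(p_i - p_i(t;q_0,p_0)\bigr), \\[4pt]
\int_\Gamma \mathcal{F}\, d\Gamma
  &= \int_\Gamma f(q,p,t)\, \Delta \, d\Gamma
   = f\bigl(q(t;q_0,p_0),\, p(t;q_0,p_0),\, t\bigr)
   = f(q_0,p_0,t_0) \quad \text{by (5)}, \\[4pt]
\int_\Gamma \mathcal{F}\, d\Gamma_0
  &= f(q,p,t) \int_\Gamma \Delta \, d\Gamma_0
   = f(q,p,t) \left| \frac{\partial (q_0,p_0)}{\partial \bigl(q(t;q_0,p_0),\, p(t;q_0,p_0)\bigr)} \right|
   = f(q,p,t).
\end{align*}
```

The first line uses only the sifting property of the delta functions and the constancy (5) of f along a path; the second uses the unit Jacobian of the Hamiltonian flow.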

A. Time independent Hamiltonian function 

Now, we assume that the set M of all points in Γ that represent possible microstates of the
system is invariant to Hamiltonian motion. We also assume that the Hamiltonian function does not
depend on time, H = H(q,p). The invariance of the measure dΓ to Hamiltonian motion and the
fact that the velocity field v(q,p) in Γ is stationary, as the Hamiltonian function does not depend
on time, lead to the following consequence. For any phase space path ω ∈ Ω(Γ), the product of
the velocity v((q,p)_ω) and the infinitesimal element dS_ω of the surface intersecting the path ω
perpendicularly at the point (q,p)_ω is constant under Hamiltonian motion along the entire length
of the path ω, i.e.,

v((q,p)_\omega)\, dS_\omega = \mathrm{const}. \qquad (9)

For any two points (q_0,p_0)_ω and (q_a,p_a)_ω on the same path ω, the following relation is obtained
from (9):

v((q_0,p_0)_\omega)\, dS_{0\omega} = v((q_a,p_a)_\omega)\, dS_{a\omega}. \qquad (10)

The infinitesimal element dS_{0ω} of the surface S_0(M) intersects the path ω perpendicularly at the
point (q_0,p_0)_ω. The surface S_0(M) is perpendicular to all paths in the set Ω(M) of paths in M.
The infinitesimal element dS_{aω} of the surface S_a(M) intersects the path ω perpendicularly at the
point (q_a,p_a)_ω. Like the surface S_0(M), the surface S_a(M) is also perpendicular to all paths in Ω(M).
The infinitesimal elements dS_{0ω} and dS_{aω} of the two surfaces S_0(M) and S_a(M) are connected by
the path ω and neighboring paths determined by solutions of Hamilton’s equations. The integral
over the surface S_a(M) is transformed using (10) into an integration over the surface S_0(M),

\int_{S_a(M)} dS_{a\omega} = \int_{S_0(M)} \frac{v((q_0,p_0)_\omega)}{v((q_a,p_a)_\omega)}\, dS_{0\omega}. \qquad (11)

The functional dependence between the points (q_0,p_0)_ω and (q_a,p_a)_ω on the path ω is not explicitly
written in the integral (11); it is implied that this functional dependence is determined from
solutions of Hamilton’s equations and the additional condition of perpendicularity of the surfaces
S_0(M) and S_a(M) to all paths in Ω(M). It is important to emphasize that perpendicularity of
the surfaces S_0(M) and S_a(M) to all phase space paths in Ω(M) is implied by the definition of
these surfaces and not as a consequence of Hamiltonian time evolution. It is also clear that the
measure defined on the surface S_0(M) can be utilized as a measure on the set Ω(M) of all phase
space paths in some invariant set M. The correspondence between points (q_0,p_0)_ω ∈ S_0(M) and
paths ω ∈ Ω(M) is one-to-one.

The infinitesimal volume element dΓ_a around the point (q_a,p_a)_ω through which the path ω ∈
Ω(M) passes can be written as dΓ_a = ds_{aω} dS_{aω}. Here, ds_{aω} is the infinitesimal distance along the
path ω, i.e. the infinitesimal arc length element of the path ω. The integral (7) can now be written
as

f(q,p,t) = \int \mathcal{F}(q,p,t; q_a,p_a,t_0)\, ds_{a\omega}\, dS_{a\omega}
         = \int_{S_0(M)} dS_{0\omega}\, v((q_0,p_0)_\omega) \int_{\omega} \frac{\mathcal{F}(q,p,t; q_a,p_a,t_0)}{v(q_a,p_a)}\, ds_{a\omega}, \qquad (12)

where in the first line dΓ_a = ds_{aω} dS_{aω} (with dummy indices) is introduced and in the second
line the integral is transformed in accordance with (11). Along the lines leading to (12), the
function G(q,p,t; (q_0,p_0)_ω, t_0) is also introduced:

G(q,p,t; (q_0,p_0)_\omega, t_0) = v((q_0,p_0)_\omega) \int_{\omega} \frac{\mathcal{F}(q,p,t; q_a,p_a,t_0)}{v(q_a,p_a)}\, ds_a. \qquad (13)

The integral in the definition of G(q,p,t; (q_0,p_0)_ω, t_0) in (13) is over the entire length of the phase
space path ω intersected perpendicularly by the surface S_0(M) at the point (q_0,p_0)_ω. Using (13),
relation (12) is then written as

f(q,p,t) = \int_{S_0(M)} G(q,p,t; (q_0,p_0)_\omega, t_0)\, dS_0. \qquad (14)

It is clear that the expression

G(q,p,t; (q_0,p_0)_\omega, t_0)\, dS_0\, d\Gamma = dP(q,p,t \cap (q_0,p_0)_\omega, t_0), \qquad (15)

represents the probability that the point corresponding to the state of the system at time t_0 is
anywhere along the paths which pass through an infinitesimal element dS_0 around (q_0,p_0) on the
surface S_0(M), and that at some different time t it is in the volume element dΓ around (q,p).
Therefore, G(q,p,t; (q_0,p_0)_ω, t_0) is a joint probability density of two continuous multidimensional
variables, (q,p) in Γ and (q_0,p_0)_ω in S_0(M).

With (15) and the definition of the joint density G(q,p,t; (q_0,p_0)_ω, t_0), the definition of the path
probability density F((q_0,p_0)_ω, t_0) is now straightforward. It is given by the integral

F((q_0,p_0)_\omega, t_0) = \int_\Gamma G(q,p,t; (q_0,p_0)_\omega, t_0)\, d\Gamma. \qquad (16)

Then, in accordance with the theory of probability, the ratio

\frac{G(q,p,t; (q_0,p_0)_\omega, t_0)\, dS_0\, d\Gamma}{F((q_0,p_0)_\omega, t_0)\, dS_0} = dP(q,p,t \,|\, (q_0,p_0)_\omega, t_0), \qquad (17)

represents the conditional probability that at time t the point corresponding to the state of the
system is in the element dΓ around (q,p), if at time t_0 it is anywhere along the paths passing
through the infinitesimal element dS_0 around (q_0,p_0) on the surface S_0(M). Relation (16) then
proves that the integral of (17) over Γ satisfies the normalization condition, i.e.,

\int_\Gamma \frac{G(q,p,t; (q_0,p_0)_\omega, t_0)}{F((q_0,p_0)_\omega, t_0)}\, d\Gamma = 1. \qquad (18)

The conditional probability density D(q,p,t|(q_0,p_0)_ω, t_0) that corresponds to the conditional
probability (17) is defined by the relation

D(q,p,t \,|\, (q_0,p_0)_\omega, t_0) = \frac{G(q,p,t; (q_0,p_0)_\omega, t_0)}{F((q_0,p_0)_\omega, t_0)}. \qquad (19)


The relation (17), like the relation (15), represents a probability which is conserved in the phase
space Γ. The total time derivative of the probability (17), i.e. its time rate of change along
the Hamiltonian flow lines, is equal to zero. In the relation (17), the path probability density
F((q_0,p_0)_ω, t_0) and the surface element dS_0 are independent of the variables t and (q,p). Also,
the measure dΓ is invariant to Hamiltonian motion. Therefore, the total time derivative of the
conditional probability (17) is equal to zero if and only if

\frac{dG}{dt} \equiv \frac{\partial G}{\partial t} + \sum_{i=1}^{s} \left( \frac{\partial G}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial G}{\partial p_i} \frac{\partial H}{\partial q_i} \right) = 0. \qquad (20)

This is a straightforward demonstration that the joint density G(q,p,t; (q_0,p_0)_ω, t_0) satisfies an
equation analogous to the Liouville equation (4) for the microstate probability density f(q,p,t).


B. Time dependent Hamiltonian function 


If the Hamiltonian function H = H(q,p,t), and hence Hamilton’s equations (1), depend on time,
the subsequent and precedent motion in the phase space Γ depends on the choice of the initial
moment of time t_0. For the initial values (q_0,p_0) given at time t_0, the solution (2) of Hamilton’s
equations uniquely determines the phase space path which passes through (q_0,p_0) at time t_0,
and thus determines the points corresponding to subsequent and precedent values of the dynamical
variables (q,p). For the same initial values (q_0,p_0) given at time t_a ≠ t_0, the solution of Hamilton’s
equations uniquely determines the phase space path which passes through (q_0,p_0) at time
t_a. If Hamilton’s equations depend on time then these two phase space paths may be different.
Invariance of Hamilton’s equations to time translations is broken and, as a result, phase space
paths are no longer time independent objects. This means that two different phase space paths
may pass through the same point in Γ at two different moments of time.
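The dependence of the phase space path on the starting time can be checked on a simple example. The Python sketch below uses a driven harmonic oscillator with the Hamiltonian H = p^2/2 + q^2/2 - A q cos t; this system, the amplitude A, the Runge-Kutta scheme and the function name `evolve` are assumptions made only for illustration. The same point (q_0,p_0) is evolved for the same duration, starting once at time 0 and once at time 1.

```python
import math

def evolve(q, p, t_start, duration, amplitude, steps=2000):
    """Integrate dq/dt = p, dp/dt = -q + amplitude*cos(t) with RK4.

    These are Hamilton's equations for the illustrative Hamiltonian
    H = p^2/2 + q^2/2 - amplitude*q*cos(t).
    """
    def rhs(t, q, p):
        return p, -q + amplitude * math.cos(t)

    dt = duration / steps
    t = t_start
    for _ in range(steps):
        k1q, k1p = rhs(t, q, p)
        k2q, k2p = rhs(t + dt / 2, q + dt / 2 * k1q, p + dt / 2 * k1p)
        k3q, k3p = rhs(t + dt / 2, q + dt / 2 * k2q, p + dt / 2 * k2p)
        k4q, k4p = rhs(t + dt, q + dt * k3q, p + dt * k3p)
        q += dt / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
        p += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        t += dt
    return q, p

def separation(amplitude):
    """Distance between the end points of two paths through (1, 0),
    one started at time 0 and the other at time 1."""
    qa, pa = evolve(1.0, 0.0, 0.0, 2.0, amplitude)
    qb, pb = evolve(1.0, 0.0, 1.0, 2.0, amplitude)
    return math.hypot(qa - qb, pa - pb)

autonomous_gap = separation(0.0)   # time independent Hamiltonian
driven_gap = separation(1.0)       # time dependent Hamiltonian
```

For A = 0 the two end points coincide, since the path through a given point is then independent of the time at which the point is reached. For A = 1 the end points are clearly separated, which shows that the moment of time is a necessary part of the specification of the path.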

This is an important distinction compared to the case of the time independent Hamiltonian
described in the previous subsection. If the Hamiltonian function H = H(q,p,t) and Hamilton’s
equations depend on time, unique specification of the phase space path requires the specification of the
point through which the path passes and also the moment of time at which it passes through
that point. Because of this, in this case we can really say that the microstate probability density
f(q,p,t) represents the probability density of paths in the phase space Γ at time t. Furthermore,
by comparing (6) and (16), we see that the joint density ℱ(q,p,t; q_0,p_0,t_0) now has the same
interpretation that has been given to the joint density G(q,p,t; (q_0,p_0)_ω, t_0) in the case of a time
independent Hamiltonian function. Accordingly, and in analogy with (17), the expression

\frac{\mathcal{F}(q,p,t; q_0,p_0,t_0)\, d\Gamma_0\, d\Gamma}{f(q_0,p_0,t_0)\, d\Gamma_0} = dP(q,p,t \,|\, q_0,p_0,t_0), \qquad (21)

is the conditional probability that at time t the point corresponding to the state of the system is in
the element dΓ around (q,p), if at time t_0 it is in the element dΓ_0 around (q_0,p_0) and therefore on the
paths passing through it. The conditional probability density B(q,p,t|q_0,p_0,t_0) that corresponds
to the conditional probability (21) is defined by the relation

B(q,p,t \,|\, q_0,p_0,t_0) = \frac{\mathcal{F}(q,p,t; q_0,p_0,t_0)}{f(q_0,p_0,t_0)}. \qquad (22)

Using (5), (8) and (22), it is easy to see that

B(q,p,t \,|\, q_0,p_0,t_0) = \prod_{i=1}^{s} \delta(q_i - q_i(t; q_0,p_0))\, \delta(p_i - p_i(t; q_0,p_0)). \qquad (23)

By a demonstration analogous to that which led to (20), now applied to the conditional
probability (21), it is simple to show that the joint density ℱ(q,p,t; q_0,p_0,t_0) also satisfies the Liouville
equation, i.e. that

\frac{d\mathcal{F}}{dt} \equiv \frac{\partial \mathcal{F}}{\partial t} + \sum_{i=1}^{s} \left( \frac{\partial \mathcal{F}}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial \mathcal{F}}{\partial p_i} \frac{\partial H}{\partial q_i} \right) = 0. \qquad (24)

To conclude, if Hamilton’s equations depend on time then phase space paths are no longer properly
specified only by the points through which they pass; time is also a necessary part of their
specification. In order to take that into account in a sensible way, we use in that case the joint density
ℱ(q,p,t; q_0,p_0,t_0) given by (8), and not the joint density G(q,p,t; (q_0,p_0)_ω, t_0) whose definition
(13) was given for the case of a time independent Hamiltonian function.


IV. INFORMATION ENTROPIES 

In Shannon’s information theory [5] the quantity of the form

H(p_1, \ldots, p_n) = -K \sum_{i=1}^{n} p_i \log p_i, \qquad (25)

has a central role as the measure of information, choice and uncertainty for different probability
distributions p_1, ..., p_n. From the understanding that the problem of constructing a communication
device depends on the statistical structure of the information that is to be communicated (it
depends for example on the probabilities p_1, p_2, ..., p_n of the symbols A_1, A_2, ..., A_n of some
alphabet), Shannon gave in (25) the most general definition up to that time of the measure of the amount of
information. Sequences of symbols or “letters” may form the set of “words” of certain length, and
the amount of information is measured analogously. The positive constant K in (25) depends on the
choice of a unit for the amount of information. In real applications expression (25), with logarithmic
base 2 and K = 1, represents the expected number of bits per symbol necessary to encode the
random signal forming a memoryless source. But perhaps it is most important that Shannon’s
interpretation of the function (25) is not dependent on the specific context of information theory.
He defined the function (25) as a measure of our uncertainty related to the occurrence of possible
events, or more specifically, as a measure of uncertainty represented by the probability distribution
p_1, p_2, ..., p_n. This is substantiated by three reasonable properties that are required from such a
measure H(p_1, ..., p_n): continuity, monotonic increase with the number of possibilities in the case when
all probabilities are equal, and the unique and consistent composition law for the addition of
uncertainties when mutually exclusive events are grouped into composite events. These three properties,
as demonstrated by Shannon in his famous theorem, are sufficient to uniquely determine the form
of the function H(p_1, ..., p_n) and it is given by (25). Shannon called the function (25) the entropy
of the set of probabilities p_1, p_2, ..., p_n.
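The definition (25) and two of the properties listed above can be checked directly. The following Python sketch is illustrative only; the function name `shannon_entropy` and its parameters are assumptions. It evaluates (25) with K = 1 and logarithmic base 2, verifies the monotonic increase for equal probabilities, and verifies the composition law on a grouping example in which the three outcomes with probabilities 1/2, 1/3, 1/6 are reached by first choosing between two equally likely alternatives and then, in one of them, choosing again with probabilities 2/3 and 1/3.

```python
import math

def shannon_entropy(probs, base=2.0, K=1.0):
    """Entropy H(p_1, ..., p_n) = -K * sum_i p_i log p_i of Eq. (25).

    Terms with p_i = 0 contribute nothing, by the usual convention.
    """
    return -K * sum(p * math.log(p, base) for p in probs if p > 0.0)

# Equal probabilities: H grows monotonically with the number of outcomes.
uniform_entropies = [shannon_entropy([1.0 / n] * n) for n in range(1, 9)]

# Composition law: direct choice among three outcomes ...
direct = shannon_entropy([1 / 2, 1 / 3, 1 / 6])
# ... equals the first binary choice plus the weighted second choice.
composed = shannon_entropy([1 / 2, 1 / 2]) \
    + 0.5 * shannon_entropy([2 / 3, 1 / 3])
```

With base 2 and K = 1 the entropy of n equally likely outcomes is log2(n) bits, so a fair coin carries one bit and eight equally likely symbols carry three bits.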

In an analogous manner Shannon has defined the entropy of a continuous distribution and the entropy
of an N-dimensional continuous distribution. Jaynes [6], on the other hand, deduced that the quantity

S_I = -\int w(x) \log \frac{w(x)}{m(x)}\, dx, \qquad (26)

corresponds to the quantity -\sum_i p_i \log p_i for a discrete probability distribution p_i which, in the
limit of an infinite number of points, tends to a continuous distribution with the density function w(x) (in
such a way that the density of points, divided by their total number, approaches a definite function
m(x)). Under a change of variables w(x) and m(x) transform in the same way, and the described
limit process from a discrete to a continuous distribution, with the definition of the measure function
m(x), yields the invariant information measure (26). Invariance of the entropy of a continuous
distribution under a change of the independent variable is thus achieved with a modification that
follows from the mathematical deduction conducted by Jaynes, and this is readily generalized to
the case when a discrete distribution passes to a continuous multidimensional distribution [6]. If
a uniform measure m = const. is assumed, then (26) differs from Shannon’s definition of the entropy
of a continuous distribution by an irrelevant additive constant. For example, in the quasiclassical
limit of quantum statistical mechanics, justification for this assumption is given by the standard
proposition that each discrete quantum state corresponds to a volume h^s of the classical phase
space, where h is Planck’s constant.
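The invariance of (26) under a change of the independent variable can be illustrated numerically. In the Python sketch below, the density w(x) = 2x on (0,1) with uniform measure m(x) = 1, the substitution y = x^2 and the midpoint integration rule are assumptions chosen only for this example. Under the substitution both w and m are multiplied by dx/dy = 1/(2 sqrt(y)), giving w(y) = 1 and m(y) = 1/(2 sqrt(y)).

```python
import math

def integrate(func, a, b, n=100_000):
    """Midpoint rule; never evaluates the integrand at the end points."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def invariant_entropy(w, m, a, b):
    """S_I = -integral of w log(w/m), as in Eq. (26)."""
    return -integrate(lambda x: w(x) * math.log(w(x) / m(x)), a, b)

def differential_entropy(w, a, b):
    """Entropy of a continuous distribution without a measure function."""
    return -integrate(lambda x: w(x) * math.log(w(x)), a, b)

# Description in the variable x: w(x) = 2x, uniform measure m(x) = 1.
s_x = invariant_entropy(lambda x: 2.0 * x, lambda x: 1.0, 0.0, 1.0)
h_x = differential_entropy(lambda x: 2.0 * x, 0.0, 1.0)

# Description in the variable y = x^2: w(y) = 1, m(y) = 1/(2 sqrt(y)).
s_y = invariant_entropy(lambda y: 1.0,
                        lambda y: 0.5 / math.sqrt(y), 0.0, 1.0)
h_y = differential_entropy(lambda y: 1.0, 0.0, 1.0)
```

Both descriptions give the same value of (26), equal to 1/2 - log 2 in closed form, whereas the entropy computed without the measure function changes from 1/2 - log 2 to 0 under the same substitution.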

Shannon [5] has also defined joint and conditional entropies of a joint distribution of two
continuous variables (which may themselves be multidimensional). In the previous section, the
joint probability density G(q,p,t; (q_0,p_0)_ω, t_0) of two continuous multidimensional variables (q,p)
in Γ and (q_0,p_0)_ω in S_0(M) was introduced. Following the detailed explanation of (15),
G(q,p,t; (q_0,p_0)_ω, t_0) dS_0 dΓ represents the probability of the joint occurrence of two events: the
first occurring at time t_0 on the set Ω(M) of all possible phase space paths and the second
occurring at time t on the set M of all possible phase space points, the set which is invariant to
Hamiltonian motion. As discussed in the previous section, in the case when Hamilton’s
equations depend on time, the same interpretation and role is given to the joint probability density
ℱ(q,p,t; q_0,p_0,t_0) of two continuous multidimensional variables in Γ × Γ: (q_0,p_0), which corresponds
to time t_0, and (q,p), corresponding to time t.

In accordance with Shannon’s definition [5], joint information entropy of G{q,p,t; {qo,Po)cL),to) 
is given by 

Sfit,to) = - [ [ciogGdTdSo. (27) 

JSo{M) Jr 

The notation $S_I^{G}(t,t_0)$ indicates that it is a function of the times $t$ and $t_0$, through the joint
probability density $G = G(q,p,t;(q_0,p_0)_\omega,t_0)$. Following Shannon’s definition [5], the conditional
information entropy of $G(q,p,t;(q_0,p_0)_\omega,t_0)$ is then given by

$$S_I^{DF}(t,t_0) = -\int_{S_0(M)}\int_{\Gamma} G\log\frac{G}{F}\,d\Gamma\,dS_0, \tag{28}$$

where $F = F((q_0,p_0)_\omega,t_0)$ is the path probability density (16). Using the definition of
$D(q,p,t|(q_0,p_0)_\omega,t_0)$ in (19), one immediately obtains the equivalent form of the conditional
information entropy (28):

$$S_I^{DF}(t,t_0) = -\int_{S_0(M)}\int_{\Gamma} DF\log D\,d\Gamma\,dS_0. \tag{29}$$

From (29) it is clear that the conditional information entropy $S_I^{DF}(t,t_0)$ is the average of the entropy
of $D(q,p,t|(q_0,p_0)_\omega,t_0)$, weighted over all possible phase space paths $\omega\in\Omega(M)$ according to the
path probability density $F((q_0,p_0)_\omega,t_0)$. The definitions of the joint $S_I^{\mathcal{F}}(t,t_0)$ and conditional
$S_I^{Bf}(t,t_0)$ information entropies of the joint distribution with the density function
$\mathcal{F}(q,p,t;q_0,p_0,t_0)$ are analogous to the definitions (27), (28) and (29). There is no need to write
them explicitly here; they are readily obtained from (27), (28) and (29) by changing the symbols with
corresponding meanings, as explained in the previous section: replace $G(q,p,t;(q_0,p_0)_\omega,t_0)$ with
$\mathcal{F}(q,p,t;q_0,p_0,t_0)$, $F((q_0,p_0)_\omega,t_0)$ with $f(q_0,p_0,t_0)$, $D(q,p,t|(q_0,p_0)_\omega,t_0)$ with
$B(q,p,t|q_0,p_0,t_0)$, $M$ and $S_0(M)$ with $\Gamma$, and $dS_0$ with $d\Gamma_0$.

The relation between the information entropies $S_I^{G}(t,t_0)$ and $S_I^{DF}(t,t_0)$, introduced in (27) and
(28), is completed by introducing the information entropy of $F((q_0,p_0)_\omega,t_0)$, or alternatively, the path
information entropy:

$$S_I^{F}(t_0) = -\int_{S_0(M)} F\log F\,dS_0. \tag{30}$$
The relation between $S_I^{G}(t,t_0)$, $S_I^{DF}(t,t_0)$ and $S_I^{F}(t_0)$ is obtained straightforwardly, using (19)
in (27), and then applying the properties of probability distributions and definition (29). In this way
one obtains

$$S_I^{G}(t,t_0) = S_I^{DF}(t,t_0) + S_I^{F}(t_0). \tag{31}$$

In accordance with [5], relation (31) asserts that the uncertainty (or entropy) of the joint event 
is equal to the uncertainty of the first plus the uncertainty of the second event when the first is 
known. To be mathematically precise, uncertainty of a joint event means here the uncertainty of 
two random variables which are defined on the space of elementary events of the same probability 
space. Uncertainty of individual events is the uncertainty of these individual random variables. 
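
Relation (31) can be checked numerically on a discrete analogue. The following is only a minimal sketch: the three "paths" and three "microstates", together with the numbers in `F` and `D`, are hypothetical values chosen for illustration and do not come from the model itself. Sums replace the phase space integrals, and the joint probabilities are formed as the product of the conditional and path probabilities, in analogy with (19).

```python
import math


def entropy(probs):
    """Shannon entropy -sum p log p (natural log); zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical path probabilities (first event, at the initial time).
F = [0.5, 0.3, 0.2]

# Hypothetical conditional probabilities D[path][state] (second event given the first).
D = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.25, 0.25, 0.50],
]

# Joint probabilities G = D * F, the discrete counterpart of definition (19).
G = [[D[w][x] * F[w] for x in range(3)] for w in range(3)]

S_joint = entropy([g for row in G for g in row])        # analogue of (27)
S_cond = sum(F[w] * entropy(D[w]) for w in range(3))    # analogue of (29)
S_path = entropy(F)                                     # analogue of (30)

# Relation (31): joint entropy = conditional entropy + path entropy.
chain_rule_gap = abs(S_joint - (S_cond + S_path))
print(f"S_joint = {S_joint:.6f}, S_cond + S_path = {S_cond + S_path:.6f}")
```

The two printed numbers agree to rounding error for any normalized choice of `F` and `D`, since the identity follows only from normalization of each row of `D`.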

In general, the uncertainty of the joint event is less than or equal to the sum of the individual
uncertainties, with equality if (and only if) the two random variables are independent [5]. The
probability distribution of the joint event is given here by the density $G(q,p,t;(q_0,p_0)_\omega,t_0)$. The
information entropy, or uncertainty, of one of them (in this case called the second event because of its
occurrence at a later time) is given by

$$S_I^{f}(t) = -\int_{\Gamma} f\log f\,d\Gamma. \tag{32}$$

The quantity $S_I^{f}(t)$ is the information entropy of the microstate probability distribution whose
density function is $f(q,p,t)$, or in short, the information entropy. The uncertainty of the first event is
given by the path information entropy $S_I^{F}(t_0)$ defined in (30). For $S_I^{G}(t,t_0)$, $S_I^{f}(t)$ and $S_I^{F}(t_0)$ the
aforementioned property of information entropies is given here by the following relation:

$$S_I^{G}(t,t_0) \le S_I^{f}(t) + S_I^{F}(t_0), \tag{33}$$

with equality if (and only if) the two random variables defining the individual events are
independent. Furthermore, from (31) and (33), one obtains an important relation between the
information entropy $S_I^{f}(t)$ and the conditional information entropy $S_I^{DF}(t,t_0)$:

$$S_I^{f}(t) \ge S_I^{DF}(t,t_0), \tag{34}$$

with equality if (and only if) the two random variables defining the individual events are
independent. In the case when Hamilton’s equations depend on time, our analysis is based on the
joint density $\mathcal{F}(q,p,t;q_0,p_0,t_0)$. Following the same argumentation leading to (34), one obtains the
relation between the information entropy $S_I^{f}(t)$ and the conditional information entropy $S_I^{Bf}(t,t_0)$:

$$S_I^{f}(t) \ge S_I^{Bf}(t,t_0), \tag{35}$$

with equality only in the case of independence of the two random variables.
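
The bound of the conditional entropy by the entropy of the second event, as in (34) and (35), can be illustrated on a small discrete example. This is a sketch under stated assumptions: the probabilities below are hypothetical, and the helper `entropies` is introduced only for this illustration. One conditional table is correlated with the first event; the other has identical rows, which is the discrete form of statistical independence.

```python
import math


def entropy(probs):
    """Shannon entropy -sum p log p (natural log); zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropies(F, D):
    """Return (S_f, S_DF): entropy of the second event and the conditional entropy."""
    n_states = len(D[0])
    # Marginal of the second event: f(x) = sum over w of D(x|w) F(w).
    f = [sum(D[w][x] * F[w] for w in range(len(F))) for x in range(n_states)]
    S_f = entropy(f)
    S_DF = sum(F[w] * entropy(D[w]) for w in range(len(F)))
    return S_f, S_DF


F = [0.5, 0.3, 0.2]  # hypothetical probabilities of the first event

# Correlated case: the conditional distribution depends on the first event.
D_corr = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.25, 0.25, 0.50],
]

# Independent case: every row is the same distribution.
D_indep = [[0.2, 0.5, 0.3] for _ in range(3)]

S_f_corr, S_DF_corr = entropies(F, D_corr)
S_f_ind, S_DF_ind = entropies(F, D_indep)

print(f"correlated:  S_f = {S_f_corr:.4f} > S_DF = {S_DF_corr:.4f}")
print(f"independent: S_f = {S_f_ind:.4f}, S_DF = {S_DF_ind:.4f}")
```

In the correlated case the inequality is strict, while in the independent case the two values coincide up to rounding, which is the equality condition stated in the text.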

In terms of probability, the events occurring at time $t_0$ on the set $\Omega(M)$ of all possible phase
space paths and at any time $t$ on the set $M\subset\Gamma$ of all possible phase space points are not
independent. If we assume that at a given initial time $t = t_0$ the values of the joint probability density
$G(q,p,t;(q_0,p_0)_\omega,t_0)$ are physically well defined (in the sense of (15)) for all points $(q,p)\in\Gamma$ and
$(q_0,p_0)\in S_0(M)$, then its values are determined at all times $t$ in the entire phase space $\Gamma$ via
the Liouville equation (20). Simple deduction leads to the conclusion that maximization of the
conditional information entropy $S_I^{DF}(t,t_0)$, subject to the constraints of the Liouville equation (20)
and normalization, cannot attain the upper bound which is given (at any time $t$) by the value
of the information entropy $S_I^{f}(t)$ in (34). Attaining this upper bound would require statistical
independence, which would have as its logical consequence a complete loss of correlation between
the paths in the set $\Omega(M)$ of possible phase space paths at time $t_0$ and the points in the set $M\subset\Gamma$
of possible phase space points at time $t$. Statistical independence is precluded at any time $t$ by the
constraint implied by the Liouville equation (20), and by the requirement that the joint probability
density $G(q,p,t;(q_0,p_0)_\omega,t_0)$ is well defined.

If the conditional information entropy $S_I^{Bf}(t,t_0)$ is maximized subject to the constraints of the
Liouville equation (24) for the joint probability density $\mathcal{F}(q,p,t;q_0,p_0,t_0)$ and normalization, by
similar deduction the same conclusion is obtained for $S_I^{Bf}(t,t_0)$ and its upper bound given by
(35). Furthermore, statistical independence between phase space points at times $t_0$ and $t$ implies
statistical independence between phase space paths at time $t_0$ and phase space points at time $t$.
The converse, on the other hand, is not always true. A phase space path consists of infinitely
many points. In the case of a time-independent Hamiltonian function, a phase space path is specified
uniquely by all these points independently of time. In that case, therefore, statistical independence
between phase space paths at time $t_0$ and phase space points at time $t$ is not sufficient for
statistical independence between points at times $t_0$ and $t$.

V. A MODEL FOR A CLOSED HAMILTONIAN SYSTEM 

At this point it is helpful to make a distinction between two different aspects of time evolution.
The first is a microscopic aspect, which represents a problem of dynamics implied in this work by
Hamilton’s equations. The solutions are represented in $\Gamma$ as phase space paths. Predicting
macroscopic time evolution represents a problem of available information and inferences from that partial
information. Therefore, along with the microscopic state, which is never known completely,
microscopic dynamics and the respective phase space paths are also part of this problem of incomplete
information. In the case of a macroscopic system, information about microscopic dynamics is very
likely to be incomplete for a variety of possible reasons. Some of them will be analyzed in
the subsequent paper [1]. However, in the absence of more complete knowledge, Hamilton’s
equations (1) and the set of possible phase space paths are the representation of our prior information
about microscopic dynamics. It is natural to assume that the macroscopic time evolution which
we are trying to predict is consistent with our knowledge of microscopic dynamics, even when this
knowledge is not complete.

All the preceding arguments lead to the conclusion that, in terms of prediction, regarding the Liouville
equation (20) as a strict microscopic constraint on time evolution is equivalent to having
complete information about microscopic dynamics. Following the previously introduced assumptions,
the Liouville equation (20) can also be regarded as a macroscopic constraint on time evolution. If
our information about microscopic dynamics is not sufficiently detailed to completely determine
the time evolution, an average is taken over all cases which are possible on the basis of partial
information. In predictive statistical mechanics as formulated by Jaynes, inferences are drawn from
probability distributions whose sample spaces represent what is known about the structure of
microstates, and which maximize information entropy subject to the available macroscopic data as
constraints [9]. In this way the “objectivity” of probability assignments and predictions is ensured by
not introducing additional assumptions which are not necessarily contained in the available data. In
the simple model developed in [25] we have introduced the same basic idea into stage (2)
(explained in Section I) of the problem of prediction for closed Hamiltonian systems. The conditional
information entropy $S_I^{DF}(t,t_0)$ is maximized subject to the constraint of the Liouville equation (20),
introduced as a phase space average, or more precisely, as an integral over phase space, similarly to
other macroscopic constraints. This approach allowed us to consider the incomplete nature of
our information about microscopic dynamics in a rational way. It leads to the loss of correlation
between the initial phase space paths and final microstates and to a corresponding uncertainty
in prediction. The conditional information entropy $S_I^{DF}(t,t_0)$ is the measure of this uncertainty,
which is related to the loss of information about the state of the system.

Now we present very briefly the basic theoretical model of the macroscopic time evolution of
closed Hamiltonian systems, which is the basis for further generalizations that will be introduced in
the subsequent paper [1]. Details of this model were partially presented in [25]. In the first approach
to this basic model, the time evolution of the conditional probability density $D(q,p,t|(q_0,p_0)_\omega,t_0)$ in
the interval $t_0 < t < t_a$ is determined from the maximization of the conditional information entropy
$S_I^{DF}(t,t_0)$ under the following two constraints: the normalization condition

$$\int_{M} D(q,p,t|(q_0,p_0)_\omega,t_0)\,d\Gamma = 1, \tag{36}$$

and the Liouville equation for $D(q,p,t|(q_0,p_0)_\omega,t_0)$,

$$\frac{\partial D}{\partial t} + \sum_{i}\left(\frac{\partial D}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial D}{\partial p_i}\frac{\partial H}{\partial q_i}\right) = 0. \tag{37}$$

From (19) it follows that the constraints given by (20) and (37) are equivalent. By definition, the set
of all possible microstates $M\subset\Gamma$ is an invariant set. The normalization constraint (36) contains
information about the structure of possible microstates in $\Gamma$, in the time interval under consideration
$t_0 < t < t_a$. Information about microscopic dynamics is represented by Hamilton’s equations (1)
and the set $\Omega(M)$ of possible phase space paths in $\Gamma$. In addition, this information is also contained
in the Liouville equation (37). By introducing the Liouville equation for $D(q,p,t|(q_0,p_0)_\omega,t_0)$ as
a strict microscopic constraint (37), the time evolution is completely determined by this equation
and the initial conditions. Maximization of the conditional information entropy $S_I^{DF}(t,t_0)$ subject to
this constraint and the normalization is therefore equivalent to solving the Liouville equation for
the $D(q,p,t|(q_0,p_0)_\omega,t_0)$ that maximizes it. This approach was introduced in [25] to prove the
consistency of the basic model and therefore will not be elaborated further in the current paper. As
was already explained in Sect. IV, for any physically well defined conditional probability density
$D(q,p,t|(q_0,p_0)_\omega,t_0)$ (in the sense of (17) and (19)), the upper bound on $S_I^{DF}(t,t_0)$ given by (34)
is not attained in the maximization under constraints (36) and (37). In this approach to the basic
theoretical model there is no statistical independence between the initial phase space paths and
final microstates. Furthermore, the value of $S_I^{DF}(t,t_0)$ is constant during the time interval under
consideration $t_0 < t < t_a$ and there is no loss of information about the state of the system.
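
The constancy of the conditional information entropy under a strict deterministic, measure-preserving evolution can be illustrated with a discrete caricature. The sketch below rests on assumptions that are not part of the model: phase space is replaced by five cells, Hamiltonian flow is replaced by a hypothetical permutation `perm` of those cells, and the numbers in `F` and `D` are arbitrary. A permutation only relabels cells, so it plays the role of an invertible, volume-preserving map.

```python
import math


def entropy(probs):
    """Shannon entropy -sum p log p (natural log); zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def transport(dist, perm):
    """Move the probability of cell i to cell perm[i]; perm is a bijection, so nothing mixes."""
    moved = [0.0] * len(dist)
    for i, p in enumerate(dist):
        moved[perm[i]] = p
    return moved


perm = [2, 0, 3, 4, 1]  # hypothetical invertible map on five cells
F = [0.6, 0.4]          # hypothetical path probabilities

# Hypothetical conditional distributions at the initial time, one row per path.
D = [
    [0.5, 0.3, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.4, 0.5],
]

history = []
for _ in range(6):
    # Conditional entropy: average of the row entropies weighted by F.
    history.append(sum(F[w] * entropy(D[w]) for w in range(len(F))))
    D = [transport(row, perm) for row in D]

spread = max(history) - min(history)
print(f"conditional entropy stays at {history[0]:.6f}, spread = {spread:.2e}")
```

The recorded values differ only by rounding, which mirrors the statement that under the strict constraint there is no loss of information about the state of the system.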

The conclusions which follow from the interpretation of relation (34) and the property of
$S_I^{DF}(t,t_0)$ as a measure of uncertainty related to loss of information were argued in Sect.
IV. In the second approach (given also in [25]) to this basic model, these conclusions are taken
into account by replacing the strict equality constraint (37) with a constraint in the form of an
integral over phase space,

$$\varphi_2((q_0,p_0)_\omega,t_0;t,D) = \int_{M}\left[\frac{\partial D}{\partial t} + \sum_{i}\left(\frac{\partial D}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial D}{\partial p_i}\frac{\partial H}{\partial q_i}\right)\right] F\,d\Gamma = 0. \tag{38}$$


The normalization constraint (36) is written here in an equivalent but more suitable form:

$$\varphi_1((q_0,p_0)_\omega,t_0;t,D) = F\int_{M} D\,d\Gamma - F = 0. \tag{39}$$
The time derivative of the conditional information entropy $S_I^{DF}(t,t_0)$ given by (29) is equal to

$$\frac{dS_I^{DF}(t,t_0)}{dt} = -\int_{S_0(M)}\int_{M}\frac{\partial D}{\partial t}F\log D\,d\Gamma\,dS_0 - \int_{S_0(M)}\int_{M}\frac{\partial D}{\partial t}F\,d\Gamma\,dS_0. \tag{40}$$

Because of the normalization (36), the last term in (40) is equal to zero. At time $t_a$ the conditional
information entropy is given by the expression

$$S_I^{DF}(t_a,t_0) = -\int_{t_0}^{t_a}\int_{S_0(M)}\int_{M}\frac{\partial D}{\partial t}F\log D\,d\Gamma\,dS_0\,dt + S_I^{DF}(t_0,t_0). \tag{41}$$

It is suitable to form the following functional:

$$J[D] = S_I^{DF}(t_a,t_0) - S_I^{DF}(t_0,t_0) = \int_{t_0}^{t_a}\int_{S_0(M)}\int_{M} K(D,\partial_t D)\,d\Gamma\,dS_0\,dt, \tag{42}$$

with the function $K(D,\partial_t D)$ given by

$$K(D,\partial_t D) = -\frac{\partial D}{\partial t}F\log D. \tag{43}$$


In the variational problem which is considered here, the functional $J[D]$ in (42) is rendered
stationary with respect to variations subject to the constraints (38) and (39). The prescribed
$D(q,p,t|(q_0,p_0)_\omega,t_0)$ at the initial time $t_0$ must be physically well defined in the sense of (17) and
(19). In this variational problem, the function $D(q,p,t|(q_0,p_0)_\omega,t_0)$ is not required to take on
prescribed values on the remaining portion of the boundary of the integration region $M\times(t_0,t_a)$ in (41).

Methods for variational problems with constraints of this type exist, and they can be developed
and applied in practical problems [26]. Here, in notation adapted to this particular
problem, the following functionals are introduced:

$$C_1[D,\lambda_1] = \int_{S_0(M)}\int_{t_0}^{t_a}\lambda_1\varphi_1\,dt\,dS_0, \tag{44}$$

and

$$C_2[D,\lambda_2] = \int_{S_0(M)}\int_{t_0}^{t_a}\lambda_2\varphi_2\,dt\,dS_0. \tag{45}$$

The Lagrange multipliers $\lambda_1 = \lambda_1((q_0,p_0)_\omega,t_0;t)$ and $\lambda_2 = \lambda_2((q_0,p_0)_\omega,t_0;t)$ are functions defined
in the integration regions in (44) and (45). For any function with continuous first partial derivatives,
the Euler equation for the constraint $\varphi_2 = \varphi_2((q_0,p_0)_\omega,t_0;t,D)$ is satisfied identically. Following the most
general multiplier rule for this type of problem, which is explained in detail in ref. [26], we introduce
an additional constant Lagrange multiplier $\lambda_0$ for the function $K$,

$$J[D,\lambda_0] = \int_{t_0}^{t_a}\int_{S_0(M)}\int_{M}\lambda_0 K(D,\partial_t D)\,d\Gamma\,dS_0\,dt. \tag{46}$$
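The functional $I[D,\lambda_0,\lambda_1,\lambda_2]$ is formed from $J[D,\lambda_0]$, $C_1[D,\lambda_1]$ and $C_2[D,\lambda_2]$:

$$I[D,\lambda_0,\lambda_1,\lambda_2] = J[D,\lambda_0] - C_1[D,\lambda_1] - C_2[D,\lambda_2]. \tag{47}$$

The existence of Lagrange multipliers $\lambda_0 \neq 0$, and $\lambda_1$, $\lambda_2$ not all equal to zero, such that
$I[D,\lambda_0,\lambda_1,\lambda_2]$ is stationary, $\delta I = 0$, represents a proof that it is possible to make $J[D]$
in (42) stationary subject to constraints (38) and (39). For a function $D(q,p,t|(q_0,p_0)_\omega,t_0)$ to
maximize $S_I^{DF}(t_a,t_0)$ subject to constraints (38) and (39), it is necessary that it satisfies the Euler
equation:

$$\lambda_0\left[\frac{\partial K}{\partial D} - \frac{\partial}{\partial t}\left(\frac{\partial K}{\partial(\partial_t D)}\right) - \sum_{i}\left(\frac{\partial}{\partial q_i}\left(\frac{\partial K}{\partial(\partial_{q_i} D)}\right) + \frac{\partial}{\partial p_i}\left(\frac{\partial K}{\partial(\partial_{p_i} D)}\right)\right)\right] - \lambda_1 F + \frac{\partial\lambda_2}{\partial t}F = 0. \tag{48}$$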


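It is easy to check that the term multiplied by $\lambda_0$ in the Euler equation (48) is equal to zero. Stationarity
of the functional $I[D,\lambda_0,\lambda_1,\lambda_2]$ in (47) is therefore possible even with $\lambda_0 \neq 0$. From (48) it follows
that the Lagrange multipliers $\lambda_1$ and $\lambda_2$ satisfy the equation

$$\frac{\partial\lambda_2((q_0,p_0)_\omega,t_0;t)}{\partial t} = \lambda_1((q_0,p_0)_\omega,t_0;t). \tag{49}$$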
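
The vanishing of the term multiplied by the constant multiplier in (48) can also be checked numerically. The sketch below is an illustration under explicit assumptions: the path density is frozen at a hypothetical constant value `F` for one fixed path, and `D(t)` is an arbitrary smooth positive test function, not a solution of the model. Since the function in (43) contains no derivatives with respect to the coordinates and momenta, only the first two terms in the bracket of (48) can contribute, and they are compared here with a central finite difference.

```python
import math

F = 0.4  # hypothetical value of the path probability density for one fixed path


def D(t):
    """Hypothetical smooth, strictly positive test function of time."""
    return 0.3 + 0.1 * math.sin(t)


def D_t(t):
    """Exact time derivative of the test function D."""
    return 0.1 * math.cos(t)


def dK_dD(t):
    """Partial derivative of K = -(dD/dt) F log D with respect to D."""
    return -D_t(t) * F / D(t)


def dK_dDt(t):
    """Partial derivative of K = -(dD/dt) F log D with respect to dD/dt."""
    return -F * math.log(D(t))


def euler_term(t, h=1e-5):
    """dK/dD - d/dt (dK/d(dD/dt)), with the total time derivative taken numerically."""
    total_time_derivative = (dK_dDt(t + h) - dK_dDt(t - h)) / (2.0 * h)
    return dK_dD(t) - total_time_derivative


max_residual = max(abs(euler_term(0.1 * k)) for k in range(100))
print(f"largest residual of the Euler term: {max_residual:.2e}")
```

The residual is at the level of the finite-difference error, consistent with the two contributions cancelling identically for any smooth positive test function.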


Another necessary condition for a maximum, in addition to (48), exists if the function
$D(q,p,t|(q_0,p_0)_\omega,t_0)$ is not required to take on prescribed values on a portion of the boundary
of $M\times(t_0,t_a)$. Then it is necessary that $D(q,p,t|(q_0,p_0)_\omega,t_0)$ also satisfies the Euler boundary
condition on the portion of the boundary of $M\times(t_0,t_a)$ where its values are not prescribed, ref.
[26]. In accordance with this, for all points on the portion of the boundary of $M\times(t_0,t_a)$ where
$t = t_a$, the Euler boundary condition gives:

$$\left[\frac{\partial K}{\partial(\partial_t D)} - \lambda_2 F\right]_{t=t_a} = -\left[\log D + \lambda_2\right]_{t=t_a}F = 0. \tag{50}$$

For all points on the portion of the boundary of $M\times(t_0,t_a)$ where time $t$ is in the interval
$t_0 < t < t_a$, the Euler boundary condition gives:

$$-F\left[\lambda_2\,\mathbf{v}\cdot\mathbf{n}\right]_{\text{boundary of } M} = 0. \tag{51}$$

In (51), $\mathbf{v}\cdot\mathbf{n}$ is the scalar product of the velocity field $\mathbf{v}(q,p)$ in $\Gamma$ (defined in Sect. II) and the
unit normal $\mathbf{n}$ of the boundary surface of the invariant set $M$, taken at the surface. Equation (51)
is satisfied naturally due to Hamiltonian motion, since the set $M$ is invariant by definition, and
therefore $\mathbf{v}\cdot\mathbf{n} = 0$ for all points on the boundary surface of $M$. This is a consequence of the fact
that phase space paths do not cross over the boundary surface of the invariant set $M$.
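
The tangency of the Hamiltonian velocity field to the boundary of an invariant set can be made concrete with a simple example. The sketch below assumes a one-dimensional harmonic oscillator with $H = (p^2 + q^2)/2$ and takes the invariant set to be the region $H \le E$; both choices are hypothetical and serve only as an illustration. The boundary is then the level curve $H = E$, whose normal is proportional to the gradient of $H$.

```python
import math


def hamiltonian_velocity(q, p):
    """Velocity field (dq/dt, dp/dt) = (dH/dp, -dH/dq) for H = (p**2 + q**2) / 2."""
    return (p, -q)


def unit_normal(q, p):
    """Unit normal to the level curve H = const, proportional to grad H = (q, p)."""
    norm = math.hypot(q, p)
    return (q / norm, p / norm)


E = 2.0                        # hypothetical energy defining the boundary H = E
radius = math.sqrt(2.0 * E)    # the level curve is a circle of this radius

max_flux = 0.0
for k in range(360):
    angle = math.radians(k)
    q, p = radius * math.cos(angle), radius * math.sin(angle)
    v = hamiltonian_velocity(q, p)
    n = unit_normal(q, p)
    # Scalar product of the velocity with the normal at this boundary point.
    max_flux = max(max_flux, abs(v[0] * n[0] + v[1] * n[1]))

print(f"largest |v . n| on the boundary: {max_flux:.2e}")
```

The scalar product vanishes to rounding error at every sampled boundary point, so no phase space path crosses the boundary in this example.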





The form of the MaxEnt conditional probability density at time $t_a$ follows from (50):

$$D(q,p,t_a|(q_0,p_0)_\omega,t_0) = \exp\left[-\lambda_2((q_0,p_0)_\omega,t_0;t_a)\right]. \tag{52}$$

For any conditional probability density that is well defined at the initial time $t_0$, there is an entire class
of equally probable solutions $\{D(q,p,t|(q_0,p_0)_\omega,t_0)\}$ obtained by the MaxEnt algorithm, all of which
satisfy the macroscopic constraint (38). At time $t_a$, all functions in this class of MaxEnt solutions
are equal and given by (52). With the exception of times $t_0$ and $t_a$, the conditional probability
density $D(q,p,t|(q_0,p_0)_\omega,t_0)$ obtained by the MaxEnt algorithm is not uniquely determined in the
interval $t_0 < t < t_a$. This is a consequence of the fact that the macroscopic constraint (38)
does not determine the time evolution of $D(q,p,t|(q_0,p_0)_\omega,t_0)$ uniquely, in the way that the strict
microscopic constraint (37) does. However, MaxEnt solutions still predict only time evolutions
entirely within the invariant set $M$, due to (51). This property follows from the constraint (38),
and takes into account the information about the constants of motion that determine the invariant
set $M$, and in that way, about the related conservation laws.

From the normalization (36) of the conditional probability density, given at time $t_a$ by (52),
one obtains the relation:

$$W(M)\exp\left[-\lambda_2((q_0,p_0)_\omega,t_0;t_a)\right] = 1, \tag{53}$$

where $W(M)$ is the measure, i.e., the phase space volume of the invariant set $M$. Equation (53)
implies that the Lagrange multiplier $\lambda_2((q_0,p_0)_\omega,t_0;t)$ at time $t = t_a$ is independent of the variables
$(q_0,p_0)_\omega$:

$$\lambda_2((q_0,p_0)_\omega,t_0;t_a) = \lambda_2(t_a). \tag{54}$$

The microstate probability density $f(q,p,t)$ at time $t = t_a$ is then calculated by using (14) and (19),
the MaxEnt conditional probability density $D(q,p,t|(q_0,p_0)_\omega,t_0)$ at time $t = t_a$ given by (52) and
(54), and the path probability distribution $F((q_0,p_0)_\omega,t_0)$ at the initial time $t_0$:

$$f(q,p,t_a) = \exp\left[-\lambda_2(t_a)\right]. \tag{55}$$

It follows from (52)-(55) that at time $t_a$, the MaxEnt conditional probability density and the
corresponding microstate probability density are equal,

$$D(q,p,t_a|(q_0,p_0)_\omega,t_0) = f(q,p,t_a) = \exp\left[-\lambda_2(t_a)\right] = \frac{1}{W(M)}. \tag{56}$$
From (29), (32) and (56), one obtains the values of the information entropies $S_I^{DF}(t,t_0)$ and $S_I^{f}(t)$ at
time $t_a$,

$$S_I^{f}(t_a) = S_I^{DF}(t_a,t_0) = \log W(M). \tag{57}$$
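
The value in (57) is the largest entropy compatible with a given accessible volume, which can be seen in a discrete analogue. The sketch below is only an illustration: the invariant set is replaced by a hypothetical set of `W = 8` equally sized cells, and the competing distributions are drawn from a seeded pseudo-random generator. The uniform distribution plays the role of the density in (56).

```python
import math
import random


def entropy(probs):
    """Shannon entropy -sum p log p (natural log); zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


W = 8  # hypothetical number of equally sized cells standing in for the volume of M

# Uniform distribution 1/W on every cell, the discrete counterpart of (56).
uniform = [1.0 / W] * W
S_uniform = entropy(uniform)

# Entropies of other normalized distributions on the same cells.
rng = random.Random(0)
S_others = []
for _ in range(1000):
    weights = [rng.random() for _ in range(W)]
    total = sum(weights)
    S_others.append(entropy([w / total for w in weights]))

max_other = max(S_others)
print(f"uniform: {S_uniform:.6f}, log W: {math.log(W):.6f}, best non-uniform: {max_other:.6f}")
```

The uniform distribution reproduces the logarithm of the number of cells, and none of the sampled non-uniform distributions exceeds it.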

Equalities (56) and (57) are possible only in the case of statistical independence. The logical
consequence of statistical independence is the complete loss of correlation between the phase space
paths at time $t_0$ and the microstates at time $t_a$. In general, a property of macroscopic systems is
that they appear to randomize themselves between observations, provided that the observations
follow each other by a time interval longer than a certain characteristic time $\tau$ called the relaxation
time [27]. In the interpretation given here, the relaxation time $\tau$ for a closed Hamiltonian system
represents a characteristic time required for the described loss of correlation between the initial
phase space paths and final microstates. Furthermore, $\tau$ also represents a time interval during
which predictions, based on incomplete information about microscopic dynamics, become uncertain
to the maximum extent compatible with the available data. This uncertainty is related to the loss
of information about the state of the system.

This interpretation is reflected in the role of the Lagrange multipliers $\lambda_1((q_0,p_0)_\omega,t_0;t)$ and
$\lambda_2((q_0,p_0)_\omega,t_0;t)$. They are required to satisfy (49), and by integrating it one obtains the following
relation,

$$\lambda_2((q_0,p_0)_\omega,t_0;t) = \int_{t_0}^{t}\lambda_1((q_0,p_0)_\omega,t_0;t')\,dt' + \lambda_2((q_0,p_0)_\omega,t_0;t_0), \tag{58}$$

for all $t$ in the interval $t_0 < t < t_a$. By using (58), with (53), (54) and (57), one obtains

$$S_I^{f}(t_a) = S_I^{DF}(t_a,t_0) = \log W(M) = \int_{t_0}^{t_a}\lambda_1((q_0,p_0)_\omega,t_0;t)\,dt + \lambda_2((q_0,p_0)_\omega,t_0;t_0). \tag{59}$$

It is clear, from relations (54), (58) and (59), that at time $t_a$ the Lagrange multiplier
$\lambda_2((q_0,p_0)_\omega,t_0;t_a) = \lambda_2(t_a)$ is determined by the measure $W(M)$ of the invariant set $M$ of all
possible microstates, i.e., the volume of accessible phase space. The subsequent application of the
MaxEnt algorithm of the described type for a closed system with Hamiltonian dynamics, without
the introduction of additional constraints, results in the increase of $W(M)$. From (54), (58) and
(59) it is then deduced that $\lambda_2(t_a) > \lambda_2(t_0)$.
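
Relations (58) and (59) state that the second multiplier accumulates the time integral of the first and reaches the logarithm of the accessible volume at the final time. The sketch below illustrates only this bookkeeping. The exponentially decaying profile in `lambda1`, the relaxation scale `tau`, and all numerical values are assumptions made for the illustration; the model itself does not prescribe the form of the rate.

```python
import math


def lambda1(t, t0, ta, tau, delta_S):
    """Hypothetical rate profile whose integral over [t0, ta] equals delta_S."""
    norm = tau * (1.0 - math.exp(-(ta - t0) / tau))
    return delta_S * math.exp(-(t - t0) / tau) / norm


def lambda2_path(t0, ta, tau, lambda2_t0, log_W, steps=20000):
    """Accumulate the second multiplier by the trapezoidal rule, following (58)."""
    delta_S = log_W - lambda2_t0
    h = (ta - t0) / steps
    values = [lambda2_t0]
    for k in range(steps):
        a = t0 + k * h
        b = a + h
        increment = 0.5 * h * (lambda1(a, t0, ta, tau, delta_S)
                               + lambda1(b, t0, ta, tau, delta_S))
        values.append(values[-1] + increment)
    return values


log_W = math.log(1000.0)   # hypothetical logarithm of the accessible volume
lam2 = lambda2_path(t0=0.0, ta=5.0, tau=1.0, lambda2_t0=2.0, log_W=log_W)

print(f"lambda2(t0) = {lam2[0]:.4f}, lambda2(ta) = {lam2[-1]:.6f}, log W = {log_W:.6f}")
```

With a nonnegative rate the accumulated multiplier grows monotonically from its initial value to the logarithm of the accessible volume, which is the content of (59) in this toy setting.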

Information about the structure of possible microstates restricts the corresponding set, and
therefore sets an upper bound on the volume of accessible phase space. The values of $S_I^{DF}(t_a,t_0)$
and $S_I^{f}(t_a)$ at time $t_a$, given in (59), are equal to the maximum value of the Boltzmann-Gibbs
entropy compatible with this information. The Lagrange multiplier $\lambda_1((q_0,p_0)_\omega,t_0;t)$, integrated
in (59) over time $t_0 < t < t_a$, is then determined by the rate at which the maximum Boltzmann-Gibbs
entropy is attained in a reproducible time evolution. The integral in (59), and the quantity
$\lambda_1((q_0,p_0)_\omega,t_0;t)$, can be identified with the change in entropy and the rate of entropy change
for a closed Hamiltonian system, respectively. If the information about the microscopic dynamics of
a closed Hamiltonian system is considered complete, whether entropy production can be defined
without recourse to coarse graining procedures, or to macroscopic, phenomenological approaches,
remains an open question. In general, part of the information is discarded in all such models, at
some stage, in order to match what is observed in nature in various manifestations of the
second law of thermodynamics. The model developed in [25] corresponds to a closed system with
a time-independent Hamiltonian function. The model presented in the current paper also includes
closed systems with a Hamiltonian function that depends on time, and the same conclusions
are obtained analogously. In that case, the model is modified by a simple change of the symbols
with corresponding meanings, as explained in Sections III and IV: replace $F((q_0,p_0)_\omega,t_0)$ with
$f(q_0,p_0,t_0)$, $D(q,p,t|(q_0,p_0)_\omega,t_0)$ with $B(q,p,t|q_0,p_0,t_0)$, $M$ and $S_0(M)$ with $\Gamma$, $dS_0$ with $d\Gamma_0$,
and $S_I^{DF}(t,t_0)$ with $S_I^{Bf}(t,t_0)$.

VI. CONCLUSION 

It is demonstrated that Jaynes’ interpretation of irreversibility as a consequence of a gradual loss
of information as to the state of the system, due to our inability to follow its exact time evolution
during the process [3], has a clear mathematical formulation in the concepts which are introduced
in this paper. The most important theoretical concept in this work was the maximization of the
conditional information entropy subject to given constraints, and its relation with the information
entropy, taken from Shannon’s information theory. At the same time, the key element of this
theoretical approach was the introduction of the Liouville equation for the conditional probability distribution
as a macroscopic constraint, i.e., as a constraint given by averaging this equation in the integral
over the available phase space. In this way, in the problem of predicting the macroscopic time
evolution of closed Hamiltonian systems, the incompleteness of our information about the detailed
microscopic dynamics of the system is included in a way which is consistent with the foundational
principles of predictive statistical mechanics. It is demonstrated that such a mathematical
description results in a total loss of correlation between the initial phase space paths and final microstates.
This loss of correlation is related to a loss of information about possible microstates of the system,
which is brought into connection with the change of entropy of the system. This connection
allowed the definition of the entropy change and the rate of entropy change for a closed Hamiltonian
system without additional assumptions. In the subsequent paper [1], we show, by generalizing this
approach and including, as additional constraints, the relevant information for prediction of
macroscopic time evolution on the hydrodynamic time scale, that it is consistent with the known
results of nonequilibrium statistical mechanics and the thermodynamics of irreversible processes.


[1] Kuic, D.: Predictive statistical mechanics and macroscopic time evolution. Hydrodynamics and entropy
production. Accessible via http://arxiv.org/abs/1506.02625

[2] Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106, 620-630 (1957)

[3] Jaynes, E.T.: Information theory and statistical mechanics. II. Phys. Rev. 108, 171-190 (1957)

[4] Gibbs, J.W.: Elementary Principles in Statistical Mechanics. Yale University Press, New Haven (1902)

[5] Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379-423, 623-656
(1948). Reprinted in: Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication.
University of Illinois Press, Urbana (1949)

[6] Jaynes, E.T.: Information theory and statistical mechanics. In: Ford, K.W. (ed.) 1962 Brandeis Lectures
in Theoretical Physics, vol. 3, pp. 181-218. W. A. Benjamin, Inc., New York (1963)

[7] Jaynes, E.T.: Where do we stand on maximum entropy? In: Levine, R.D., Tribus, M. (eds.) The
Maximum Entropy Formalism, pp. 15-118. MIT Press, Cambridge (1979)

[8] Jaynes, E.T.: Gibbs vs Boltzmann entropies. Am. J. Phys. 33, 391-398 (1965)

[9] Jaynes, E.T.: The minimum entropy production principle. Ann. Rev. Phys. Chem. 31, 579-601 (1980)

[10] Jaynes, E.T.: Macroscopic prediction. In: Haken, H. (ed.) Complex Systems - Operational Approaches
in Neurobiology, Physics, and Computers, pp. 254-269. Springer, Berlin (1985)

[11] Grandy, W.T.: Principle of maximum entropy and irreversible processes. Phys. Rep. 62, 175-266 (1980)

[12] Jaynes, E.T.: The second law as physical fact and as human inference. Unpublished manuscript (1990).
http://bayes.wustl.edu/etj/node2.html

[13] Harris, R., Hurley, J., Garrod, C.: Nonequilibrium ensemble dynamics. Phys. Rev. A 35, 1350-1359
(1987)

[14] Garrod, C.: Statistical Mechanics and Thermodynamics. Oxford University Press, New York (1995)

[15] Zurek, W.H.: Algorithmic randomness and physical entropy. Phys. Rev. A 40, 4731-4751 (1989)

[16] Grandy, W.T.: Time evolution in macroscopic systems. I. Equations of motion. Found. Phys. 34, 1-20
(2004)

[17] Grandy, W.T.: Time evolution in macroscopic systems. II. The entropy. Found. Phys. 34, 21-57 (2004)

[18] Grandy, W.T.: Time evolution in macroscopic systems. III. Selected applications. Found. Phys. 34,
771-813 (2004)

[19] Grandy, W.T.: Entropy and the Time Evolution of Macroscopic Systems. Oxford University Press,
Oxford (2008)

[20] Tishby, N.Z., Levine, R.D.: Time evolution via a self-consistent maximal-entropy propagation: the
reversible case. Phys. Rev. A 30, 1477-1490 (1984)

[21] Plastino, A.R., Plastino, A.: Statistical treatment of autonomous systems with divergenceless flows.
Physica A 232, 458-476 (1996)

[22] Plastino, A., Plastino, A.R., Miller, H.G.: Continuity equations, H-theorems, and maximum entropy.
Phys. Lett. A 232, 349-355 (1997)

[23] Plastino, A.R., Plastino, A.: Universality of Jaynes’ approach to the evolution of time-dependent
probability distributions. Physica A 258, 429-445 (1998)

[24] Schonfeldt, J-H., Jimenez, N., Plastino, A.R., Plastino, A., Casas, M.: Maximum entropy principle and
classical evolution equations with source terms. Physica A 374, 573-584 (2007)

[25] Kuic, D., Zupanovic, P., Juretic, D.: Macroscopic time evolution and MaxEnt inference for closed
systems with Hamiltonian dynamics. Found. Phys. 42, 319-339 (2012)

[26] Wan, F.Y.M.: Introduction to the Calculus of Variations and Its Applications. Chapman & Hall, New
York (1995)

[27] Kittel, C.: Elementary Statistical Physics. Wiley, New York (1958)